... for people who wish to learn about something or conduct business.

The Internet is organized only by the name of each web site. Each individual or group maintaining a web site decides how that web site will be organized. Thus, there is no official catalog of information available on the Internet. Anyone desiring information must hypothesize which web sites would be likely to
... "hits," or web sites containing those words. ... Conventional keyword searching returns any ... A user may add additional keywords to narrow the search ... contains the keyword. ... links to possibly relevant ...

... the directory. A user must make a series of decisions to ... reach any useful information at all. ... information should be organized differs from that of the directory's creators, the more ...

... of the benefits of both search engines and directories without the drawbacks listed above. For example, there is a need for a tool that could reliably provide a list of highly ... Furthermore, such a tool should provide information locations based on a complete ... Consequently, a more advanced data extraction tool may provide ... such as the Internet.

... tool capable of context-sensitive searching, pinpointing, databasing, ... combination thereof. ... The tool filters the local database to ... in the query. These pinpoint site locations are presented to a user and ... system is provided for manually or automatically determining the proper ... the data extraction tool provides ... probability of relevance to the user. ... much effort to refine the search ... learned by the practice of the invention as set forth hereinafter.

In order that the manner in which the above-recited and other advantages and objects ... thereof which are illustrated in the appended drawings. ... additional specificity and detail through the use of the accompanying ...
Figure 2 is a schematic block diagram of one embodiment of a data extraction tool on a server, such as the server of Figure 1;

Figure 3 is a schematic block diagram of a data extraction tool, as shown in Figure 2, configured for use with a node in a network system, as shown in Figure 1;

Figure 4 is a schematic block diagram of data structures for administering and executing a user interface in accordance with the invention;

Figure 5 is a schematic block diagram of data structures for administering and executing a filtering module in accordance with the invention;

Figure 6 is a schematic block diagram of data structures for administering and executing an attributes index in accordance with the invention;

Figure 7 is a schematic block diagram of methods for implementing one embodiment of the data structures and functions of Figure 2 in accordance with the invention;

Figure 8 is a schematic block diagram of methods for implementing one embodiment of the mining step of Figure 7 in accordance with the invention;

Figure 9 is a schematic block diagram of methods for implementing one embodiment of the database construction step of Figure 7 in accordance with the invention;

Figure 10 is a schematic block diagram of methods for implementing one embodiment of the searching step of Figure 7 in accordance with the invention;

Figure 11 is a schematic block diagram of an alternative method for implementing the data structures and functions of Figure 2;

Figure 12 is a schematic block diagram of methods for implementing one embodiment of the context construction module of Figure 11 in accordance with the invention;

Figure 13 is a schematic block diagram of methods for implementing one embodiment of the context comparison module of Figure 11 in accordance with the invention;

Figure 14 is a schematic block diagram of methods for implementing one embodiment of the information matching module of Figure 11 in accordance with the invention; and

Figure 15 is a schematic diagram of a hierarchical database usable in conjunction with the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The presently preferred embodiments of the present invention will be best understood ... It will be readily understood that the components of the present invention, as generally described and illustrated in the figures herein, could be arranged and designed in a wide variety of different configurations. Thus, the following more detailed description of the embodiments of the apparatus, system, and method of the present invention, as represented in Figures 1 through 15, is not intended to limit the scope of the invention, as claimed, but is merely representative of presently preferred embodiments of the invention.

Those of ordinary skill in the art will, of course, appreciate that various modifications to the details of the figures may easily be made without departing from the essential characteristics of the invention. Thus, the following description of the figures is intended only ... with the invention as claimed.

Referring now to Figure 1, a system 10 or network 10, such as the Internet, may include nodes 11 (e.g. nodes 50, 52, 54). Each node 11 may include a processor 12 and memory devices 14, such as storage devices 16, read only memory (ROM) 18, and random access memory (RAM) 20, sometimes referred to as operational memory. The node 11 may include a variety of input devices 22 and output devices 24, whether dedicated as illustrated in Figure 1, or more generally available over a network.

Typically, a node 11 may include a network card 26 for connecting to a network 30 (e.g. network 10) outwardly, and a bus 32 for interconnecting elements internally.

Input devices 22 may include a keyboard 34, a mouse 36 or other pointing device 36 such as a stylus or graphics tablet, an interactive touch screen 38, a scanner 40, or even a storage device 41 for providing data to the node 11. Similarly, output devices 24 may include a monitor 42, printer 44, storage devices 46, and the like for providing data from the node 11.

A router 48 may interconnect networks 30, 50 where each network 30, 50 may include some simple nodes 52, such as clients 52a-52d, and servers 54. Networks 30, 50 are well understood in the art. Accordingly, the hardware illustrated is by way of example, and not limitation as to the hardware suite on which the invention may be implemented. More or less equipment may be used in many particular embodiments.

The system 10 is the data store or database from which information is to be obtained. However, the system 10 need not be configured as shown in Figure 1. For example, the system 10 may be a database contained on a single computer. However, many of the subsequent descriptions will refer to the system 10 as a distributed network 10 of computers, such as the Internet.

Figure 2 shows one embodiment of a data extraction tool 110, or tool 110, with its associated modules. A mining module 112 gathers information from a data source, preferably the Internet. A databasing module 114 categorizes and sorts information within a local database. This information can be actual data directly from the data source, or it can be simply pointers to locations of data within the data source.

An input module 116 interfaces ... A filtering module 118 filters information to isolate the data most relevant to a user's request. A pinpointing module 120 locates and returns identification of the exact location of information. A presentation module 122 presents information summaries and locations to a user. An indexing module 124 organizes information for use and access by a user. An updating module 126 automatically updates information in a local database.

The arrows in Figure 2 show a general chronological flow. However, the modules shown do not have to be accessed in the order shown. In addition, modules can operate ... and stored by the indexing module 124.

Referring to Figure 3, a node may have a hard disk 128 or HD 128, an input/output port 130 or I/O 130, a central processing unit 132 or CPU 132, and a memory 133. The modules 112, 114, 116, 118, 120, 122, 124, and 126 may be temporarily stored for use in the memory 133, permanently stored in the hard disk 128, and processed through the ... communication with a user and with the network 10 via the I/O 130. A transaction interface 135 may also be included to permit purchasing and selling over the network 10.

Figure 4 shows some data structures that may be included in the user interface 134. A free form input module 150 receives searching parameters, in the form of a query, from a user. A semantic analysis module 152 parses the query and uses context templates 154 to develop a list of contexts that may correlate to information desired by a user. An inquiry ... are truly relevant. A query modification module 158 modifies the query to suit a user's responses to prompting from the inquiry module 156. A presentation module 160 displays ... nodes (e.g. nodes 50, 52, 54) where further information may be stored.

In addition, a site interaction module 162 can permit partial processing of information by the data extraction tool 110 before presentation to a user. A pinpoint selection module 164 chooses relevant sites for further processing. A login module 166, if needed, may permit the site interaction module 162 to automatically log onto a site where relevant information is stored. A link selection module 168 chooses the most relevant path within the site for retrieval of the desired information. A page parsing module 170 determines whether text from the site is relevant to a user's query.

Referring to Figure 5, some data structures that may be used in the filtering module 118 are shown. The semantic net 174 is a resource for matching query text from a user to text from a web site. Context clues 176 provide information for contextual comparisons based on classifications 178 of contexts in which a word may be found. A context selector 180 selects those contexts that correlate to the proper context for the query and isolates them via filters 182. The filters 182 may reference the context system 184, which simply provides a list of actions corresponding to each instance of a word. For example, the context system 184 may specify that a site should be retained if a keyword is found in a certain context within the site, but that the site should be filtered out if the keyword is used in a different context.
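
The retain-or-filter behavior described for the context system 184 can be sketched as follows. This is only an illustrative sketch under stated assumptions: the `Site` class, the action names, and the dictionary layout are hypothetical and are not taken from the specification, and how a context is detected within a site is left out of scope.

```python
from dataclasses import dataclass

# Hypothetical action names; the specification only says the context system
# lists "actions" for each instance of a word.
RETAIN = "retain"
FILTER_OUT = "filter_out"


@dataclass
class Site:
    url: str
    # (keyword, context) pairs assumed to have been observed in the site text.
    keyword_contexts: list


def filter_sites(sites, keyword, context_system):
    """Keep sites in which `keyword` appears in a context mapped to RETAIN."""
    kept = []
    for site in sites:
        actions = [
            context_system.get((kw, ctx), FILTER_OUT)
            for kw, ctx in site.keyword_contexts
            if kw == keyword
        ]
        if RETAIN in actions:
            kept.append(site)
    return kept


context_system = {
    ("jaguar", "automobile"): RETAIN,
    ("jaguar", "zoology"): FILTER_OUT,
}
sites = [
    Site("http://example.com/cars", [("jaguar", "automobile")]),
    Site("http://example.com/cats", [("jaguar", "zoology")]),
]
kept = filter_sites(sites, "jaguar", context_system)
```

In this sketch a site that uses the keyword in both a retained and a filtered context is kept; the specification does not say how such mixed cases are resolved, so that choice is an assumption.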

Referring to Figure 6, the context clues 176 and the context selector 180 may reference an attributes index 185. The attributes index 185 contains a list of words 186. Each word has at least one meaning 187 indexed to that word, and each meaning 187 has a list of relations 188, such as synonyms, antonyms, subsets, supersets, usage correlation, and usage association. A second meaning 189, and however many meanings exist for the word 186, may also be included with an associated list of relations.

The relations 188 provide context clues 176 so that a given web site can be classified by context. The context may be determined, for example, by the frequency and combination of relations 188 that appear within the web site. Thus, the filters 182 can filter out those web sites in which the proper keyword is used in an irrelevant context.
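
One way to picture the attributes index 185 is as a nested mapping from each word to its meanings, and from each meaning to its related words. The sketch below classifies a site by counting how often the relations of each meaning occur in the site text. The sample entries, the function name, and the simple frequency score are assumptions made for illustration; the specification does not prescribe a scoring formula.

```python
from collections import Counter

# Hypothetical attributes index: word -> meaning -> set of related words.
attributes_index = {
    "bass": {
        "fish": {"fishing", "lake", "trout", "angler"},
        "music": {"guitar", "band", "amplifier", "drum"},
    }
}


def classify_context(word, site_text, index):
    """Return the meaning whose relations occur most often in site_text.

    Returns None when the word is unknown or no relation appears at all.
    """
    tokens = Counter(site_text.lower().split())
    scores = {
        meaning: sum(tokens[r] for r in relations)
        for meaning, relations in index.get(word, {}).items()
    }
    if not scores or max(scores.values()) == 0:
        return None
    return max(scores, key=scores.get)


text = "Our band sells bass guitar strings and a bass amplifier"
context = classify_context("bass", text, attributes_index)
```

A filter could then compare the returned meaning with the meaning intended by the query and discard sites where the two differ.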

Figure 7 shows one method for implementation of the data structures of Figure 2. In a mining step 190, a data extraction tool 110 mines information from the network 10. The locations of the information, and possibly some of the information itself, may be compiled and formatted in a database construction step 191. A searching step 192 permits a user to query for information stored by the database construction step 191. A filtering step 193 selects the information most relevant to a user's query. A pinpointing step 194 determines the exact location of the relevant information on the network 10. A presenting step 196 organizes relevant information and provides it to a user. An indexing step 198 links relevant information to the location of that information on the network 10. An updating step 200 subsequently performs an automatic search of the network 10 for new information relevant to the user's query.

Figure 8 shows possible steps that might be included within the mining step 190. In a route selection step 202, the tool 110 chooses an orderly method for processing information from the network 10. Preferably, the route selection step 202 involves an orderly progression to ensure that each potentially relevant parcel of data is processed once and only once. In an autonavigation step 204, the tool 110 receives information from the network 10 for processing in a content reading step 206.

In an evaluation step 208, the tool 110 evaluates the potential relevance of the text 146 of a site to future queries of a user. The tool 110 may be directed towards acquiring a certain type of information, or broadly used to obtain and categorize a wide variety of data. The scope of data to be mined determines how selective the evaluation step 208 will be. In a content extraction step 210, potentially relevant content is compared against a listing of needed information to further filter it in a database filtration step 212. The data are indexed for ready access by an addition to a master index step 214.
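
The requirement that each parcel of data be processed once and only once can be illustrated with a breadth-first traversal that tracks visited pages. This is a sketch under stated assumptions: the link graph is an in-memory dictionary standing in for the network 10, and `is_relevant` is a hypothetical stand-in for the evaluation step 208. The specification does not state that breadth-first order is the route actually chosen.

```python
from collections import deque


def mine(start, links, is_relevant):
    """Visit each reachable page exactly once; return relevant pages in visit order."""
    visited = {start}
    queue = deque([start])
    master_index = []
    while queue:
        page = queue.popleft()
        if is_relevant(page):
            master_index.append(page)
        for nxt in links.get(page, []):
            # The visited set guarantees no page is queued twice,
            # even when the link graph contains cycles.
            if nxt not in visited:
                visited.add(nxt)
                queue.append(nxt)
    return master_index


links = {"a": ["b", "c"], "b": ["a", "c"], "c": ["d"], "d": ["a"]}
result = mine("a", links, lambda page: page != "b")
```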

Figure 9 shows possible steps that might be included within the database construction step 191. A database structuring step 216 provides the structure and organization for the information. In a schema provision step 218, a relations recording step 220 and an indices recording step 222 organize data into fields that are appropriately linked together and indexed for rapid reference. In an input data step 224, the tool 110 receives information gathered during the mining step 190.

A data classification step 226 uses discrimination functions 228 to categorize 
information within the schema developed by the schema provision step 218. A schema 
refining step 230 permits revision of the schema as needed to accommodate information that 
otherwise cannot be appropriately categorized within the schema. In a records filling step 
232, the tool 110 adds data to form records. 
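
A minimal sketch of the data classification step 226 follows, treating each discrimination function 228 as a predicate attached to one schema category. The category names, record layout, and first-match rule are assumptions made for illustration. Records that no function accepts are collected separately, since those are the cases the schema refining step 230 would need to address.

```python
def classify(records, discriminators):
    """Assign each record to the first category whose predicate accepts it.

    Returns (categorized, unclassified); unclassified records signal that
    the schema may need refining.
    """
    categorized = {name: [] for name in discriminators}
    unclassified = []
    for record in records:
        for name, accepts in discriminators.items():
            if accepts(record):
                categorized[name].append(record)
                break
        else:
            unclassified.append(record)
    return categorized, unclassified


# Hypothetical discrimination functions keyed by schema category.
discriminators = {
    "commerce": lambda r: "checkout" in r["text"],
    "reference": lambda r: "definition" in r["text"],
}
records = [
    {"url": "s1", "text": "add to cart and checkout"},
    {"url": "s2", "text": "definition of a router"},
    {"url": "s3", "text": "family photos"},
]
categorized, unclassified = classify(records, discriminators)
```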

In an addition to database step 234, the tool 110 adds data retrieved by the mining step 190 to the local database. This involves a number of steps. In a site identification step 236, the tool 110 identifies sites of relevant information. In a site isolation step 238, the tool 110 further filters sites based on criteria provided by a user or by the programming of the tool 110. For example, the tool 110 can be programmed to isolate sites capable of conducting commerce over the Internet. In that case, the site isolation step 238 would filter out all sites without a method for conducting commerce through the site. In a site contents classification step 240, the tool 110 classifies data into appropriate categories, as laid out in the schema. A data selection step 242 chooses classifiable data for transmission to a record preparation step 244, where data is added to records in the local database.

Referring to Figure 10, a number of steps may be included within the searching step 
192. A user may request information by entering free form text or other query inputs in a 
query receiving step 246. In a query parsing step 248, the query is compared against a list 
of possible contexts by a semantic net reference step 250. In an inquiry preparation step 252, 
the tool 110 forms a question for a user, in a question selection step 254, to ask for 
clarification concerning which of the potential contexts that may match the query is the most 
relevant. 

The inquiry computation step 256 may provide an estimate of the time required to perform a search for each potential context, so that a user will know how long the tool 110 will take to process a search for a given context. This is especially helpful when a user has provided a very broad query. In such a case, the computation time will be high, so a user will know that the search will take a comparatively long time and provide a comparatively large amount of information, perhaps more than desired.

In an additional input receiving step 258, the tool 110 receives more text or menu selections from a user to identify which of the context or contexts are desired for searching. In an index reading step 260, the tool 110 reads an index of information contexts created in conjunction with the database construction step 191. The relevant context or contexts in the index are linked to site locations for information. The tool 110 returns these site locations to a user in a pinpointed sites returning step 262.
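
The index reading step 260 and the pinpointed sites returning step 262 amount to a lookup from selected contexts to site locations. The sketch below assumes the index is a plain dictionary from context name to a list of locations; the names and sample entries are hypothetical.

```python
def pinpoint_sites(selected_contexts, context_index):
    """Return site locations linked to the selected contexts, without duplicates."""
    seen = set()
    locations = []
    for ctx in selected_contexts:
        for loc in context_index.get(ctx, []):
            if loc not in seen:
                seen.add(loc)
                locations.append(loc)
    return locations


# Hypothetical index built during database construction.
context_index = {
    "automobile": ["site-a", "site-b"],
    "zoology": ["site-c"],
    "sports team": ["site-b", "site-d"],
}
found = pinpoint_sites(["automobile", "sports team"], context_index)
```

Because the contexts were indexed ahead of time, answering a query in this sketch needs no new crawl of the network, only dictionary lookups.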

Referring to Figure 11, the searching and filtering modules may alternatively be embodied as shown in Figures 11 through 14, in contrast to the configurations shown in Figures 5, 6, and 10. As above, the input module 116 may transmit text 117 reflecting a search query to the filtering module 118, which may then filter information to isolate what a user is seeking. In this embodiment, the filtering module 118 includes a context construction module 300 for assembling micro-contexts 301 based on the text 117, a context comparison module 302 for converting the micro-contexts 301 to macro-contexts 303, and an information matching module 304 for matching the macro-contexts 303 to specific information 306 responsive to the user's query. The presentation module 122 again provides the information to a user.

The input module 116 may acquire text to describe information sought by a user in a variety of different ways. For example, a simple free form text search may be used, wherein the user types a query in plain language. Alternatively, a user may provide key words separated by operators such as and, or, not, and others known in the art. The input module 116 may be configured to refine the text through questions to be answered by a user. The filtering module 118 then receives the text from the input module 116. Until processed, the text is only a series of words with no inherent meaning to a computer. The filtering module 118, in this embodiment, will convert the text into searchable portions to find matching information of the type desired by a user.

Referring to Figure 12, the context construction module 300 is shown in greater detail. The context construction module 300 assembles the words to form small, coherent groups, or micro-contexts 301, which may contain, for example, about 1 to 5 words. This is accomplished in part by using a block parser 310, which breaks down and interprets the text. The text can be broken up by the block parser 310 in a number of different ways. Keywords 312 and their modifiers, if designated by a user, can form or define natural contexts for searching. Similarly, relative values 314 or priorities assigned to words in the text may be used by the block parser 310 to create micro-contexts 301. Occurrence patterns 316 may be used to form natural separations between groups of words.

These occurrence patterns 316 may be obtained from a user's history 318 corresponding to a given user's activities with the tool, including prior searches and results, or from a general language database such as the attributes index 185. The user history 318 in any case may provide the tool 110 with information concerning what information a user has requested in the past, and therefore what information the user is most likely looking for with a new inquiry.
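
As a rough illustration of how a block parser might split query text into micro-contexts of about 1 to 5 words, the sketch below breaks the text at common function words and caps each group at five words. Treating a fixed stop-word list as the "occurrence pattern" is an assumption made for this example; the specification derives such patterns from the user history 318 or the attributes index 185.

```python
# Hypothetical separator words standing in for learned occurrence patterns.
STOP_WORDS = {"a", "an", "the", "for", "of", "in", "on", "with", "and", "or", "to"}


def build_micro_contexts(text, max_words=5):
    """Split text into word groups, breaking at stop words and at max_words."""
    groups, current = [], []
    for word in text.lower().split():
        if word in STOP_WORDS:
            # A stop word closes the current group and is itself discarded.
            if current:
                groups.append(tuple(current))
                current = []
            continue
        current.append(word)
        if len(current) == max_words:
            groups.append(tuple(current))
            current = []
    if current:
        groups.append(tuple(current))
    return groups


micro_contexts = build_micro_contexts("cheap mountain bike for a tall rider")
```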

Referring to Figure 13, the context comparison module 302 is shown in greater detail. The context comparison module 302 receives the micro-contexts 301 from the context construction module 300 and compares them to a corpus 330 of information. The corpus 330 may simply be a database with samples of information 332 in natural language format, indexed according to macro-contexts 303. These macro-contexts 303 may be more specific than the micro-contexts 301.

The corpus 330 is sized to suit the amount and type of information on the network 10. The corpus 330, for example, may be composed of portions of text from 100,000 to 200,000 web sites, or more, with each portion matched to a macro-context 303. The entire corpus 330 may be between 10 Megabytes and 10 Gigabytes in size, or larger. A text comparison algorithm 336 may be provided to match text from the corpus 330 to the micro-contexts 301, and then return the corresponding macro-context 303. The text comparison algorithm 336 may combine several micro-contexts 301 to permit a more specific search, thereby narrowing the number of matching macro-contexts 303.

Ideally, the context comparison module 302 will return a small number of macro- 
contexts 303. However, this may not be possible for two reasons. First, if multiple, 
important, micro-contexts 301 are identified by the context construction module 300, they 
might not appear together within any portion of the corpus 330. In such a case, the context 
comparison module 302 may return a series of macro-contexts 303 that match some fraction 
of the important micro-contexts 301. Although these macro-contexts may not precisely 
match a user's query, they may be ranked in order of likelihood that they will be relevant to 
the user. The ranking may be obtained by using the user history 318 and other factors, such
as the number, probability, or nature of prior requests of the macro-context 303 by other 
users, to determine the probability that a given macro-context 303 is relevant to the user. 
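
The fallback described above, returning macro-contexts that match only some fraction of the micro-contexts and ranking them, can be sketched as follows. The scoring here uses only the matched fraction; the user history and request statistics mentioned in the text are omitted, and the corpus layout (macro-context name mapped to one sample text) is a simplifying assumption.

```python
def rank_macro_contexts(micro_contexts, corpus):
    """Rank macro-contexts by the fraction of micro-contexts found in their sample.

    corpus maps a macro-context name to a sample text. A micro-context counts
    as matched when all of its words occur in the sample.
    """
    ranked = []
    for name, sample in corpus.items():
        sample_words = set(sample.lower().split())
        matched = sum(1 for mc in micro_contexts if set(mc) <= sample_words)
        if matched:
            ranked.append((name, matched / len(micro_contexts)))
    # Highest fraction first; ties broken alphabetically for a stable order.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


micro = [("mountain", "bike"), ("tall", "rider")]
corpus = {
    "cycling/equipment": "a mountain bike sized for a tall rider",
    "cycling/trails": "mountain bike trails near the lake",
    "cooking": "recipes for pasta",
}
ranking = rank_macro_contexts(micro, corpus)
```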

Alternatively, the micro-contexts 301 may not even be found in the corpus 330. In 
that case, a user may be referred to a user tracking module 338, which provides a user with 
portals to access and search the network 10 directly. The user tracking module 338 permits 
the tool 110 to track a user's progress through the network 10 to obtain further context 
information for the current search, acquire more general information regarding contexts 
important to the user, or find important information not currently present within the corpus 
330. 

A rapid mining module 339 may also be accessed while the user tracking module 338 is operating, to add nodes 52, or sites 52, to the corpus 330 and to process them through the databasing module 114 "on the fly," or while the user is accessing the tool 110. These may be sites 52 visited by the user or suggested by the user's query.

After searching the network 10 through the user tracking module 338, a user may once again be referred to the input module 116, in order to provide additional text inputs, or the context comparison module 302 may resume operation to process the micro-contexts 301 through new additions to the corpus 330.

Referring to Figure 14, the information matching module 304 receives macro-contexts 303 from the context comparison module 302 and compares them to an indexed database 350. The indexed database 350 contains specific information 306 of the type desired by the user, indexed by macro-contexts 354 identical or similar to those provided by the context comparison module 302. Thus, using a structure-matching algorithm 356, the information matching module 304 can find the portion of specific information 306 that correlates to the macro-contexts 303 provided by the context comparison module 302. The specific information 306 located by the information matching module 304 may then be returned to the presentation module 122 to be presented to a user.

The presentation module 122 is preferably flexible in its operation. For example, the depth and breadth of specific information 306 returned may be varied according to a user's preferences. Once the specific information 306 is located within the indexed database 350, proximate information is easily gathered and returned. The order and arrangement of specific information 306 displayed may also be determined manually by a user or automatically by reference to the user history 318.

Referring to Figure 15, the indexed database 350 may be structured as a hierarchical database 400. The hierarchical database 400 is configured like a tree, with general information at the top and more specific information below. A parcel of information 402 desired by a user is a specific portion, and is therefore near the bottom of the hierarchical database 400. According to traditional methods prior to the current invention, a user would locate the parcel of information 402 by navigating through the broadest classification 404 and through the branches 406, 408, and 410. A user might find this path difficult or even impossible to locate, particularly if the user knows little about the parcel of information 402, the organizational scheme in which it resides, or the related elements in the hierarchy, and therefore little about where it should be classified.

Docket No. 1785.2.2
August 2, 2000

The current invention permits a user to navigate across hierarchies straight to the parcel of information 402. The hierarchical database 400 remains transparent to the user, who need not familiarize himself or herself with the structure of the hierarchical database 400. Thus, the method disclosed herein provides horizontal navigation across a hierarchical database, in which the tool 110 intelligently determines exactly what the user is looking for and searches among the more specific, lower branches of the hierarchical database 400 to find it.
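
Horizontal navigation can be pictured as indexing the leaves of the tree directly, so that a lookup jumps to a parcel without walking down from the root. The sketch below assumes the hierarchy is a nested dictionary and that leaf labels are unique; both are simplifications, and the sample categories are hypothetical.

```python
def build_leaf_index(tree, path=()):
    """Map each leaf label of a nested-dict tree to its full path from the root."""
    index = {}
    for label, child in tree.items():
        here = path + (label,)
        if child:
            index.update(build_leaf_index(child, here))
        else:
            index[label] = here
    return index


# Hypothetical hierarchy; an empty dict marks a leaf (a parcel of information).
hierarchy = {
    "products": {
        "sporting goods": {"bicycles": {"mountain bike": {}, "road bike": {}}},
        "electronics": {"audio": {"bass amplifier": {}}},
    }
}
leaf_index = build_leaf_index(hierarchy)
```

With such an index, a matched context such as "mountain bike" resolves to its location in one step, while the path through the upper branches stays hidden from the user.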

One application of such a hierarchical, searchable database is to provide information about products for sale over the Internet. In such a case, the presentation module 122 ultimately returns words to the user to denote the various products in the hierarchical database 400 that match the user's request. The presentation module 122 may, for example, be configured to sort products matching the user's request by brand, model, specifications, price, vendor, availability, distance to the vendor from the user, shipping cost, or any number of other relevant parameters.

In addition, the login module 166 may operate to navigate a site 52 for a user, including forms presented by the site 52 to collect information from the user. Thus, not only is a user freed from the need to navigate the hierarchical database, the user may also be permitted to access the site 52 and conduct business on it without having to navigate the structure of the site 52.

The tool 110 as configured above is also well adapted for use without such a hierarchical structure. The context matching capabilities of the tool 110 make the tool 110 effective for improving the relevance and completeness of results to a query, regardless of what formats are used by the tool 110 to maintain and organize a local database. This is a vast improvement over current search engines, which typically search only for the exact text provided by the user, and thus deliver results that include irrelevant items and fail to include important information.

From the above discussion, it will be appreciated that the present invention provides 
a data extraction tool for extracting information from an information source. Extracted 
information is cataloged and indexed for future searching by a user. Although not limited to 
commerce, the method disclosed herein may be adapted to search for commerce-ready web 
sites on the Internet. 

The present invention may be embodied in other specific forms without departing 
from its structures, methods, or other essential characteristics as broadly described herein and 
claimed hereinafter. The described embodiments are to be considered in all respects only as 
illustrative, and not restrictive. The scope of the invention is, therefore, indicated by the 
appended claims, rather than by the foregoing description. All changes that come within the 
meaning and range of equivalency of the claims are to be embraced within their scope. 

What is claimed and desired to be secured by United States Letters Patent is: 